PyramidBox: A Context-assisted Single Shot Face Detector
Authors
Abstract
Face detection has been well studied for many years, and one of the remaining challenges is detecting small, blurred, and partially occluded faces in uncontrolled environments. This paper proposes a novel context-assisted single shot face detector, named PyramidBox, to handle this hard face detection problem. Observing the importance of context, we improve the utilization of contextual information in the following three aspects. First, we design a novel contextual anchor, which we call PyramidAnchors, to supervise high-level contextual feature learning by a semi-supervised method. Second, we propose the Low-level Feature Pyramid Network to combine adequate high-level contextual semantic features with low-level facial features, which also allows PyramidBox to predict faces of all scales in a single shot. Third, we introduce a context-sensitive structure to increase the capacity of the prediction network and improve the final accuracy of the output. In addition, we use the Data-anchor-sampling method to augment the training samples across different scales, which increases the diversity of training data for smaller faces. By exploiting the value of context, PyramidBox achieves superior performance among state-of-the-art methods on two common face detection benchmarks, FDDB and WIDER FACE.
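The abstract does not spell out how Data-anchor-sampling works, so the following is only a minimal sketch of one plausible reading: a face is rescaled toward a randomly chosen anchor scale, biased toward smaller scales, so that small faces become more frequent in training. The anchor scales, the "at most one step above the nearest anchor" rule, the jitter range, and the function names `nearest_anchor_index` and `sample_resize_factor` are all assumptions made for illustration, not details confirmed by the text above.

```python
import math
import random

# Assumed anchor scales in pixels (hypothetical; the abstract does not list them).
ANCHOR_SCALES = [16, 32, 64, 128, 256, 512]


def nearest_anchor_index(face_size, scales=ANCHOR_SCALES):
    """Return the index of the anchor scale closest to the given face size."""
    return min(range(len(scales)), key=lambda i: abs(scales[i] - face_size))


def sample_resize_factor(face_w, face_h, rng=random, scales=ANCHOR_SCALES):
    """Sample an image resize factor that moves one face toward a random anchor scale.

    Assumed rule: the target anchor is drawn uniformly from index 0 up to one
    step above the face's nearest anchor, so faces are shrunk more often than
    they are enlarged.
    """
    # Use the geometric mean of width and height as the face size.
    face_size = math.sqrt(face_w * face_h)
    i_near = nearest_anchor_index(face_size, scales)
    # Target index is capped at the largest available anchor.
    i_target = rng.randint(0, min(len(scales) - 1, i_near + 1))
    # Jitter the target size around the chosen anchor scale (assumed range).
    target_size = rng.uniform(scales[i_target] / 2, scales[i_target] * 2)
    return target_size / face_size


if __name__ == "__main__":
    rng = random.Random(0)
    factor = sample_resize_factor(140, 140, rng)
    print(f"resize the whole image by a factor of {factor:.3f}")
```

In a training pipeline, the returned factor would be applied to the whole image and its box annotations before cropping; that step is omitted here because it depends on the image library in use.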
Similar resources
Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network
Single Shot MultiBox Detector (SSD) is one of the fastest algorithms in the current object detection field; it uses a fully convolutional neural network to detect objects of all scales in an image. Deconvolutional Single Shot Detector (DSSD) is an approach that introduces more context information by adding a deconvolution module to SSD. And the mean Average Precision (mAP) of DSSD on PASCAL VO...
Context-aware Single-Shot Detector
SSD [18] is one of the state-of-the-art object detection algorithms, and it combines high detection accuracy with real-time speed. However, it is widely recognized that SSD is less accurate in detecting small objects than large objects, because it ignores the context from outside the proposal boxes. In this paper, we present CSSD, a shorthand for context-aware single-shot multibox object...
Learning to identify video shots with people based on face detection
We examine how to identify video shots with at least two humans using only detected face information. While face detection is much more reliable than shape-based people classification in broadcast video, one particular difficulty is that, when there are several humans in an image, the accuracy of face detection is usually significantly degraded, which leads to poor performance in identifying sh...
DSSD: Deconvolutional Single Shot Detector
The main contribution of this paper is an approach for introducing additional context into state-of-the-art general object detection. To achieve this we first combine a state-of-the-art classifier (Residual-101 [14]) with a fast detection framework (SSD [18]). We then augment SSD+Residual-101 with deconvolution layers to introduce additional large-scale context in object detection and improve accu...
PicSOM Experiments in TRECVID 2008
Our experiments in TRECVID 2008 include participation in the high-level feature extraction, automatic search, video summarization, and video copy detection tasks, using a common system framework. In the high-level feature extraction task, we extended our last year’s experiments, which were based on SOM-based semantic concept modeling followed by a post-processing stage utilizing the concepts’ t...